Voice Recording Methods and Systems
Field of the Invention 

The present invention relates to methods and systems for recording a 
conversation or a message. 

Summary of the Invention

In general terms the present invention proposes that individuals hold 
conversations, or leave messages for each other, using a communication 
system which records at least their voices. The users are permitted to 
annotate the recordings with tags indicating points or portions of the 
recordings having particular meanings.

It is particularly advantageous if the locations where the messages or 
conversations are stored are accessible to multiple individuals (e.g. the 
individual(s) who recorded them, and/or other individuals), i.e. they are 
"shared". 

Brief description of the Figures

Non-limiting embodiments of the invention will now be described, for the sake 
of example only, with reference to the following figures, in which: 

Fig. 1 shows a first embodiment of the invention; and 

Fig. 2 shows a second embodiment of the invention. 

Detailed Description of the embodiments

Referring to Fig. 1, a first embodiment of the invention is shown schematically 
including first and second telephone communication devices 1, 3. These are 
drawn as mobile phones having a screen 13, but the invention is not limited in
this respect, and is applicable to any telephone devices,
including video telephones in which the screen of the first communication
device 1 displays an image of the user of the second communication device 3.
Alternatively, they may be computer apparatus such as PCs or Net terminals 
with a microphone and telephone compatibility. 

Furthermore, the telephone devices may be any future system which
transmits, in addition to a voice signal (and optionally a video signal), other data,
e.g. streamed with the voice signal. For example, the other data may be text 
words, such as words which visually represent what either individual says. 

The two communication devices 1, 3 communicate via a communication 
network indicated by reference numeral 5, which may be of any form, such as
an existing public telephone system (usually referred to as a PSTN). The
connections between the communication devices 1, 3 and the network 5 are
indicated as lines 7, 9, but while it is possible for this connection to be made
by fixed lines such as electrical cables or optical fibre, the connections to the
network 5 may equally be of any other known or future form, such as wireless
(e.g. radio) connections. 

Optionally, either or both of the units 1, 3 may be units of respective 
communication networks (e.g. mobile phone networks) operated by respective 
exchanges which are not shown explicitly in Fig. 1. These exchanges are 
connected to the public telephone system 5, and are subsumed in the
connections 7, 9. 

The communication network 5 is also connected to a computer system 11 
(e.g. server) having a storage facility. Preferably either user of the 
communication devices 1, 3 can choose whether or not to enable the 
computer system 11, that is to place the computer system into a state in
which it is party to the conversation. The enablement of the computer system
can be done at the time when the conversation is initiated, or optionally at any
point during the conversation. Optionally, either of the users can also disable
the computer system at any time, so that it is not party to the conversation. In
the case that they do decide to enable the computer system 11, the computer
system 11 is supplied with information identifying the devices 1, 3 and/or their
users. The identification of the user(s) and/or device(s) to the computer
system 11 may include accessing an account for one or both of the users
and/or devices maintained at the system 11, by an identity verification 
procedure of a conventional kind. 

Preferably, when the computer system 11 is in its enabled state, the users are 
able to indicate to the computer system 11 which portion(s) of the telephone 
conversation it should record. For example, at any point in the conversation
either of the users may be able to transmit a "record" instruction or a
"terminate recording" instruction to the computer system 11 to initiate or
terminate recording. There is preferably no limit on the number of portions of
the telephone call that the computer system 11 may record.

Optionally, the computer system may make two separate recordings of the
conversation. Each of these recordings may be made under the control of a 
respective one of the users, such that each user indicates to the computer 
system which portions of the conversation to include in his own recording. 
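
By way of non-limiting illustration only, the per-user control of recording described above might be sketched as follows. The class name PortionRecorder and its methods are hypothetical and form no part of the described embodiment; times are assumed to be seconds from the start of the call.

```python
class PortionRecorder:
    """Tracks, per user, which portions of a call that user asked to record.

    Hypothetical sketch: "record" opens a portion for the user and
    "terminate" closes it. There is no limit on the number of portions.
    """

    def __init__(self):
        self.portions = {}  # user -> list of (start_s, end_s) portions
        self._open = {}     # user -> start time of the portion in progress

    def record(self, user, time_s):
        # A repeated "record" while already recording is ignored.
        self._open.setdefault(user, time_s)

    def terminate(self, user, time_s):
        # A "terminate recording" with no portion in progress is ignored.
        start = self._open.pop(user, None)
        if start is not None and time_s > start:
            self.portions.setdefault(user, []).append((start, time_s))


recorder = PortionRecorder()
recorder.record("user_1", 0.0)
recorder.record("user_3", 5.0)
recorder.terminate("user_1", 10.0)
recorder.terminate("user_3", 30.0)
```

In this sketch each user ends up with his own list of portions, corresponding to the two separate recordings mentioned above.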

Note that the system may be arranged such that the computer system 11 is
enabled for all conversations (e.g. all conversations involving a given user),
and/or that (e.g. as a default state) it is set to record all of each conversation 
for which it is enabled. 

The users of devices 1, 3 then carry out a conversation. The computer system 
11 receives the entire conversation, and stores a recording of it. In the case 
that the conversation includes video telephony, the recording preferably
includes a recording of the video portion as well as a recording of the audio
(voice) portion. The recording is stored within the computer system 11 in
association with indexing data including the received identity of the user(s)
and/or the device(s) 1, 3. The indexing data further includes the time and
date of the conversation. 
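
Purely as an illustrative sketch, the association between a recording and its indexing data might be represented as below. The names Recording and RecordingIndex, the field names, and the file names are assumptions made for the example and are not taken from the embodiment.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Optional, Tuple


@dataclass
class Recording:
    participants: Tuple[str, ...]  # identities of the user(s) and/or device(s)
    started_at: datetime           # time and date of the conversation
    audio_path: str                # hypothetical location of the stored audio
    tags: List[Tuple[float, str]] = field(default_factory=list)


class RecordingIndex:
    """Looks recordings up by the indexing data stored with them."""

    def __init__(self):
        self._items: List[Recording] = []

    def add(self, recording: Recording) -> None:
        self._items.append(recording)

    def find(self, participant: Optional[str] = None,
             on_date: Optional[date] = None) -> List[Recording]:
        # A criterion left as None matches every recording.
        return [
            r for r in self._items
            if (participant is None or participant in r.participants)
            and (on_date is None or r.started_at.date() == on_date)
        ]


index = RecordingIndex()
index.add(Recording(("device_1", "device_3"),
                    datetime(2001, 3, 5, 14, 30), "call_0001.wav"))
matches = index.find(participant="device_3")
```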

The computer system 11 is adapted to add one of a predetermined set of tags
to the recording under the control of either or both of the users. That user, or
those users, can control the computer system 11 to add those tags during the
ongoing conversation ("on the fly"), and/or after the conversation is finished
(e.g. at a time when the user reconnects to the computer system 11, and 
possibly completes an additional self-identification procedure as described 
above, before accessing the recording using the indexing data to identify it). 

Optionally, each of the tags may be one audio tone, or a sequence of audio
tones, inserted or overlaid onto the recording of the conversation. For
example, each audio tone may be a DTMF code associated with a respective
one of the keys 15. A user can add a tag which is a single DTMF tone by
keying the respective key, or a tag which is a plurality of tones by keying the
corresponding sequence of keys.

Each tag has a respective meaning, and the tags are identifiable automatically 
(e.g. in the case that the tags are DTMF tones, well-known technology exists 
to identify them automatically). The users of devices 1, 3 (and/or anyone else
having an access status recognised by the computer system) may extract the
recording and replay it. At this stage, the information stored by the tags is of
value. 
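
As one hedged illustration of such automatic identification, the sketch below synthesises the tone pair for a key and recognises it again using the Goertzel algorithm, a commonly used technique for DTMF detection. The row and column frequencies are those of the standard DTMF keypad; the function names, the 8 kHz sample rate and the 50 ms tone length are assumptions for the example, and the detector assumes that a tone is actually present in the samples it is given.

```python
import math

# Standard DTMF keypad: each key is the sum of one row tone and one column tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_samples(key, sample_rate=8000, duration_s=0.05):
    """Synthesise the two-tone signal for one key (hypothetical helper)."""
    for r, row in enumerate(KEYPAD):
        c = row.find(key)
        if len(key) == 1 and c >= 0:
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(sample_rate * duration_s)
    return [
        0.5 * math.sin(2 * math.pi * ROW_HZ[r] * t / sample_rate)
        + 0.5 * math.sin(2 * math.pi * COL_HZ[c] * t / sample_rate)
        for t in range(n)
    ]


def goertzel_power(samples, freq_hz, sample_rate=8000):
    """Signal power near freq_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def detect_key(samples, sample_rate=8000):
    """Pick the strongest row tone and the strongest column tone."""
    row = max(range(4), key=lambda i: goertzel_power(samples, ROW_HZ[i], sample_rate))
    col = max(range(4), key=lambda i: goertzel_power(samples, COL_HZ[i], sample_rate))
    return KEYPAD[row][col]


detected = detect_key(dtmf_samples("5"))
```

A practical detector would additionally compare the tone power against the rest of the signal, so that speech without any tag is not mistaken for a key.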

For example, when the recording is re-played using one of the telephones 1,
3, a message may be displayed on the screen indicating the meaning of any
tag which is encountered. Furthermore, when the recording is re-played, the
telephones 1, 3 may actually reproduce the tones using their sounders, so
that the user may recognise their meanings for himself.

Some possible tags might have the respective meanings of (i) the beginning
or (ii) the end of business negotiations, (iii) the beginning or (iv) the end of 
discussions concerning transport arrangements, etc. Other examples of 
possible tag meanings will be clear from later portions of the present text. 

Furthermore, the mode of replaying may be modified according to the tags.
For example, a user listening to the recording may have the option to jump at 
any moment to the next tag (or to the next tag of any given type(s)). 
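
The jump to the next tag might, by way of example only, be sketched as follows. The tags are assumed to be held as (time in seconds, tag type) pairs sorted by time; the function name and the tag type names are hypothetical.

```python
from bisect import bisect_right


def next_tag(tags, position_s, kinds=None):
    """Return the first tag strictly after position_s, or None.

    tags  -- list of (time_s, kind) pairs sorted by time_s
    kinds -- optional set of tag types to stop at; None means any type
    """
    times = [time_s for time_s, _ in tags]
    first_later = bisect_right(times, position_s)
    for time_s, kind in tags[first_later:]:
        if kinds is None or kind in kinds:
            return (time_s, kind)
    return None


tags = [(5.0, "phone_number"), (12.0, "entertainment"), (30.0, "phone_number")]
target = next_tag(tags, 5.0, kinds={"phone_number"})
```

Here a listener positioned at the first tag who asks for the next tag of the "phone_number" type is taken to 30.0 seconds, skipping the intervening tag of another type.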

Furthermore, any recording may be edited (within the computer system 11, or 
after the recording has been extracted from the computer system, optionally 
leaving a copy of the recording there) based on the tags.

For example, the recording may be transformed into a second recording 
which, when played, omits sections delineated by pairs of the tags of certain 
type(s). This editing is preferably non-destructive, such that the portions of the 
first recording which are omitted when the second recording is played, are 
merely "hidden" and can be restored on demand.
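
A minimal sketch of such non-destructive editing is given below, on the assumption that the second recording is represented simply as a list of segments of the first recording to be played. The function names and the tag type names are hypothetical. The first recording is never modified, so the hidden portions can be restored by discarding the list of hidden spans.

```python
def hidden_spans(tags, start_kind, end_kind):
    """Pair each start tag with the next end tag to delimit a hidden span."""
    spans = []
    open_at = None
    for time_s, kind in sorted(tags):
        if kind == start_kind and open_at is None:
            open_at = time_s
        elif kind == end_kind and open_at is not None:
            spans.append((open_at, time_s))
            open_at = None
    return spans


def playable_segments(duration_s, hidden):
    """Segments of the first recording that remain when the spans are hidden."""
    segments = []
    cursor = 0.0
    for start, end in sorted(hidden):
        if start > cursor:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration_s:
        segments.append((cursor, duration_s))
    return segments


tags = [(10.0, "transport_start"), (25.0, "transport_end"),
        (40.0, "transport_start"), (50.0, "transport_end")]
spans = hidden_spans(tags, "transport_start", "transport_end")
segments = playable_segments(60.0, spans)
```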

In a further example, the tags may be used to enhance a presently existing 
editing technique, such as one which eliminates silences, or detects changes 
in the speaker. This may be done by arranging for the tags to have meanings
associated with those functions, e.g. a tag indicating the start or end of a
silence, or a tag indicating a change of speaker.

A further example is that the tags can be used collectively to generate further 
annotation. For example, the recording can be reviewed automatically to 
identify regions of interest or "value" based on the observation of predefined 
patterns of tag usage. For example, regions of the recording containing tags 
with a statistical frequency above a certain threshold (or simply of higher than
average statistical frequency) can be labelled as interesting. The very
presence of certain sorts of tags may be enough to influence this annotation
by "value", e.g. there can be a tag meaning "high value" and/or a tag meaning
"low value".

Note that, whereas tags are preferably associated with exact points in the 
recording, or portions of the recording with well-defined ends set by the tags, 
the "value" parameter may be defined continuously over some or all of the
recording, for example varying according to the distance to the nearest tag(s) 
of certain type(s). 
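
The two forms of "value" annotation discussed above might be sketched as follows, for the sake of example only. The fixed window length, the use of the average count as the default threshold, and the exponential fall-off with distance from the nearest tag are all assumptions chosen for illustration; the embodiment does not prescribe any particular formula.

```python
import math
from bisect import bisect_left


def interesting_windows(tag_times, duration_s, window_s=30.0, min_tags=None):
    """Windows whose tag count exceeds min_tags (default: the average count)."""
    n_windows = max(1, math.ceil(duration_s / window_s))
    counts = [0] * n_windows
    for t in tag_times:
        counts[min(int(t // window_s), n_windows - 1)] += 1
    threshold = min_tags if min_tags is not None else sum(counts) / n_windows
    return [
        (i * window_s, min((i + 1) * window_s, duration_s))
        for i, count in enumerate(counts)
        if count > threshold
    ]


def value_at(t, tag_times, scale_s=10.0):
    """Continuous "value": 1.0 at a tag, decaying with distance to the nearest.

    tag_times must be sorted; with no tags at all the value is 0.0.
    """
    if not tag_times:
        return 0.0
    i = bisect_left(tag_times, t)
    neighbours = tag_times[max(0, i - 1):i + 1]
    distance = min(abs(t - x) for x in neighbours)
    return math.exp(-distance / scale_s)


busy = interesting_windows([5.0, 8.0, 12.0, 70.0, 100.0], duration_s=120.0)
```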

Subsequently, the editing procedures described above can be performed 
based on the assigned "value". For example, passages of low value may be 
omitted or hidden, and/or passages of high value may be transmitted to
specified individuals. Furthermore, portions of high "value" may be stored (e.g. 
in the computer system 11) at a preferential compression rate, or selected for 
automatic summarisation. 

Note that the editing procedure may include automatically removing some or 
all of the tags (e.g. the tags of given type(s)).

Preferably, the annotated recordings created by the first embodiment can be 
forwarded to other individuals, or portions of them defined by the tags may be 
forwarded. 

Although the invention has been explained above in relation to a 
conversation, any recording may also be a message left in the computer
system 11 by a single user with the tags (added at the time or subsequently)
providing annotations of the messages. The messages are for subsequent
retrieval by one or more other users specified by data associated with the
message. For example, the owner of communication device 1 may access the
computer system 11 and leave a message annotated with tags of a plurality of
types for subsequent retrieval by the owner of communication device 3.

It is particularly convenient if the system 11 is one, such as the exchange of a
mobile telephone network, which also stores messages without tags, and 
conventional email messages. 

Preferably, users (with appropriate access status) are able to access the 
computer system 11 not only via telephones but using computers such as
PCs. More generally, the access to the computer system 11 may be using
browser software. 

Any device having a screen (e.g. the PC or the phones 1, 3) may also be able
to access the computer system 11 and see a visual representation of a given
recording, for example as a timeline having icons of types corresponding to
the types of respective tags. The icons are in an order corresponding to the 
order of the corresponding tags in the recording. They may be equally spaced 
along the timeline, or be at locations along the timeline spaced corresponding 
to the spacing of the corresponding tags in the recording. 
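
The two icon layouts described above might be computed as in the following sketch, in which the function name, the pixel width of the timeline and the tag type names are assumptions made for the example.

```python
def icon_positions(tags, duration_s, width_px, proportional=True):
    """Horizontal pixel position and tag type of each icon, in recording order.

    proportional=True  -- spacing follows the spacing of the tags in time
    proportional=False -- icons are equally spaced along the timeline
    """
    ordered = sorted(tags)
    if proportional:
        return [(round(time_s / duration_s * width_px), kind)
                for time_s, kind in ordered]
    step = width_px / (len(ordered) + 1)
    return [(round(step * (i + 1)), kind)
            for i, (_, kind) in enumerate(ordered)]


tags = [(30.0, "phone_number"), (10.0, "negotiation_start")]
by_time = icon_positions(tags, duration_s=100.0, width_px=200)
evenly = icon_positions(tags, duration_s=100.0, width_px=200, proportional=False)
```

In both layouts the icons keep the order of the corresponding tags in the recording; only their spacing differs.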

Note that it is not necessary that both of the "users" of devices 1, 3 are
human. Rather, the invention can usefully be employed when one of the users is
a machine, generating machine-generated voice signals (e.g. computationally 
or by playing a predetermined recording) operating a telephone device which 
is simply an interface between the machine and the communication network. 

In this case the "conversation" between the users may have little or no
information passed from the human user: it may for example consist of the 
human user phoning the machine to establish the communication and then 
annotating sounds automatically generated by the machine. 

A second embodiment of the invention is shown in Fig. 2. Whereas in the first 
embodiment, the computer system 11 was not especially associated with
either of the users (but rather had its own operator, such as the operator of
the network 5), in the embodiment of Fig. 2, the computer system 17 is
associated with the communication device 1. Indeed, although Fig. 2 shows
them as separate but connected units, computer system 17 may be physically
part of the communication device 1. 

Note that in the case described above in which the communication device 1 is 
part of a communication network operated by an exchange (not shown) which 
communicates with the network 5, the computer system 17 may be connected
either directly to the unit 1 or to the exchange associated with the unit 1. Fig. 2 
is intended to cover both of these two cases. 

We now discuss two scenarios in which an embodiment of the invention is 
used. In the following description the reference numerals used are those of 
the first embodiment of the invention, but the second embodiment would also
be suitable. 

A first scenario concerns an individual Andrea, the owner of mobile telephone 
1, who is working away from her office. Andrea checks her e-mails using a 
PC, and finds that an individual Paul has sent Andrea three annotated phone 
conversations created by the first embodiment of the invention. Andrea skims
through the conversations she has been sent. 

The next day, she uses her mobile phone 1 to call the Los Angeles Police 
Department to arrange for two officers to marshal traffic at a location the 
following week. During the conversation, which is recorded by the computer
system 11, she is given a reference number and a contact phone number,
together with a list of details to get back with. She flags all these points on the
fly by pressing keys 15 (which adds DTMF tones to the recording) and saves
the conversation in the system 11. The tags may be tags which specify that a
phone number is present, or alternatively tags which do not have this specific
meaning.

She then uses her phone 1 to call the tourist office at Big Sur and gets a list
of hotels in the area. As she talks, she uses the keys 15 to signal to the 
computer system 11 to flag the phone numbers of several suitable hotels. 

She then contacts the system 11 directly (which may be done simply by 
phoning a certain number) and leaves a short message on the computer
system 11 to be read by another individual Duncan. This message is attached 
to an annotated copy of a phone conversation she had with the client, and 
forwarded to Duncan. She labels one short portion of the message as 
particularly important, by placing respective kinds of tags at either end of it. 

Andrea remembers a previous conversation with a colleague about
restaurants. She accesses the conversation by connecting to the computer
system 11 on her mobile telephone 1 and skips to a point tagged with a tag
associated with "entertainment", where a certain restaurant was mentioned. 
She notes the phone number then makes a reservation for that night. 

After dinner, Andrea spends 30 minutes editing her files of phone
conversations. She does this by going through and inserting respective kinds
of tags to indicate portions of different meanings, automatically determining
the interest value at each point, and then automatically erasing the parts for
which the value indicates that they are of little interest. She copies several
phone numbers into her SIM card. Finally, she calls her mother for a chat. Her
mother gives Andrea her brother's temporary address, which Andrea flags
within the record of the call stored on the computer system 11.

The second scenario concerns an individual Duncan.

On a given day, Duncan uses his telephone 1 to access the computer system
11, and skims through a message left by Andrea the previous day. It contains
an annotated conversation with a client showing disagreement over the job 
budget. Duncan needs to follow this problem up. 

His assistant Paul accesses the computer system 11, goes through the
history of communications with the client, and sets up a meeting for that
afternoon. Paul copies Duncan the relevant correspondence, e-mails and a
phone message containing several forwarded audio clips from computer
system 11.

When Duncan skims through the clips using the tags as reference points, he 
finds confirmation of the terms that were agreed on Andrea's budget. Duncan 
asks Paul to record and annotate the meeting using a microphone, and to use his
mobile phone 3 to transfer the recording of the meeting made by the
microphone to the computer system 11.

Duncan has an important meeting at 11.00am with a potential client. To help 
prepare for this, Paul has accessed an audio file stored in the computer 
system 11 in which Andrea makes a presentation to a different client.

He also forwards one of the files to the mobile phone of the first client. The 
first client listens to the presentation and agrees he would like Andrea to be
part of a project they are collaborating on. 

Duncan then has a meeting with the first client to discuss the budget.
Duncan reminds the client of various items of correspondence, and clears up
any ambiguity by playing a clip that Paul retrieved from the computer
system 11 earlier.

Before going to bed, to remain on top of a scheduling problem, Duncan leaves 
a message to himself on the computer system 11 in the form of a long, 
annotated list of urgent actions, each given a tag of a sort indicating its 
importance level. He forwards a copy to Paul's mobile phone. 

The next day, Duncan has a meeting at a client's office in San Francisco.
Duncan knows that the computer system 11 is storing some records of the
early brainstorming sessions. Paul had recorded and annotated these
sessions. Duncan refers to his diary to find the date and time of these
sessions. With this information he can locate the relevant recordings by 
accessing the computer system 11 on his mobile phone. To access the 
computer system 11, he enters his user-name and password then locates the 
recordings, one by one. He skims through the first session, jumping from tag 
to tag until he finds a 'magic moment'. 

Claims

1. A method of communication between at least two individuals including:

the two individuals conversing using respective telephone
communication devices, a recording being made of at least part of the
conversation;

at least one of the individuals associating one or more tags with
selected respective points or portions of the recording, each tag being
automatically interpretable and indicating a meaning of the respective point or
portion of the recording; and

storing the recording and tags in a location accessible by at least one
of the two individuals.

2. A method according to claim 1 in which the location is accessible to 
both of the two individuals. 


3. A method according to claim 1 in which the location is accessible to 
individuals other than the said two individuals. 

4. A method according to any of claims 1 to 3 in which one of the 
individuals is a machine generating voice signals automatically.

5. A method according to any of claims 1 to 3 in which the tags are 
selected from a predetermined plurality of possible tags. 

6. A method of communicating a message from a first individual to a
second individual, including:

the first individual using a telephone communication device and a
telecommunications network to transmit a recording of a message for the
second individual to a storage location accessible at least by the second
individual; 

the first individual or the second individual associating one or more 
tags, each selected from a plurality of predetermined tag types, with selected 
respective points or portions of the recording, each tag being automatically
interpretable and indicating a meaning of the respective point or portion of the 
recording; and 

storing the tags in said location. 

7. A method according to claim 6 in which the first individual is a machine
generating voice signals automatically. 

8. A method according to any preceding claim in which the association of 
tags with the portions of the recording is performed using at least one of the
communication devices, the possible tags being associated with respective
keys of that communication device and the tags being selected by selecting 
the respective keys. 

9. A method according to claim 6 in which the recording is recorded as an 
audio track, and the tags are DTMF tones added to the audio track.

10. A method according to any preceding claim in which the association of 
at least one of the tags is performed while the conversation is still proceeding. 

11. A method of reviewing the recording produced by a method according
to any preceding claim, the method including automatically locating the 
portions of the recording using the tags and reviewing sections of the 
recording determined by the tags. 

12. A method according to claim 11 which includes displaying a visual
representation of the conversation including symbols indicating locations of 
the tags within the recording. 

13. A method of processing the recording produced by a method according
to any of claims 1 to 10, the method including automatically locating the 
points or portions of the recording using the tags and processing the recording 
based on the meaning of the tags. 

14. A method according to claim 13 in which said processing includes
selecting at least one segment of the recording based on the tags, and 
generating an edited version of the recording including or excluding the at 
least one segment. 

15. A method according to claim 13 in which said processing includes
using the tags to determine, for differing sections of the recording, differing 
values of an interest parameter indicating the interest of those sections of the 
recording. 

16. A communication system including:

at least two telephone communication devices;

a communication network for communicating between the
communication devices;

a recording device accessible using the communication devices, the
recording device being for recording a conversation between the
communication devices; and 

means for associating one or more tags with selected respective 
portions of a recording recorded by the recording device. 

17. A communication system including:

at least two telephone communication devices;

a communication network for communicating between the
communication devices; 

a recording device accessible using the communication devices, the 
recording device being for recording a message left by one of the 
communication devices for retrieval by another of the communication devices; 
and 

means for associating one or more tags, each tag being a selected
one of a plurality of types, with selected respective portions of the message 
recorded by the recording device. 

18. A communication system according to claim 16 or claim 17 in which the 
recording device is associated with an operator of the communication network 
and is remote from the communication devices. 

19. A communication system according to claim 16 or claim 17 in which the 
recording device is associated with one of the communication devices, and is 
proximate or connected to that communication device. 

20. A communication system according to any of claims 16 to 19 in which 
the communication devices are video telephone devices. 